Abstract. A road map can be interpreted as a graph embedded in the plane, in
which each vertex corresponds to a road junction and each edge to a particular
road section. We consider the cartographic problem of placing non-overlapping road
labels along the edges so that as many road sections as possible are identified by
their name, i.e., covered by a label. We show that this problem is NP-hard in general,
but that it can be solved in polynomial time if the road map is an embedded tree.


1 Introduction 


Map labeling is a well-known cartographic problem in computational geometry
[Chapter 58.3.1], [13]. Depending on the type of map features, one can distinguish
labeling of points, lines, and areas. Common cartographic quality criteria are that labels
must be disjoint and clearly identify their respective map features. Most of the
previous work concerns point labeling, while labeling line and area features has received
considerably less attention. In this paper we address the labeling of linear features, namely
roads in a road map.

Geometrically, a road map is the representation of a road graph G as an arrangement
of fat curves in the plane. Each road is a connected subgraph of G (typically a simple
path) and each edge belongs to exactly one road. Roads may intersect each other in
junctions, the vertices of G, and we denote an edge connecting two junctions as a road
section. In road labeling the task is to place the road names inside the fat curves so that
the road sections are identified unambiguously; see Fig. 1.

Chirie presented a set of rules and quality criteria for label placement in road
maps based on interviews with cartographers. These include that (C1) labels are placed
inside and parallel to the road shapes, (C2) every road section between two junctions
should be clearly identified, and (C3) no two road labels may intersect. Further, he
gave a mathematical description for labeling a single road and introduced a heuristic
for sequentially labeling all roads in the map. Imhof's foundational cartographic work
on label positioning in maps lists very similar quality criteria. Edmondson et al.
took an algorithmic perspective on labeling a single linear feature (such as a river).
While Edmondson et al. considered non-bent labels, Wolff et al. introduced an
algorithm for a single linear feature that places labels following the curvature of the
linear feature. Strijk [9] considered static road labeling with embedded labels and
presented a heuristic for selecting non-overlapping labels out of a set of label candidates.
Seibert and Unger considered grid-shaped road networks. They showed that in those
networks it is NP-complete and APX-hard to decide whether for every road at least one
label can be placed. Yet, Neyer and Wagner introduced a practically efficient algorithm
that finds such a grid labeling if possible. Maass and Döllner presented a heuristic for
labeling the roads of an interactive 3D map with objects (such as buildings). Apart from
label-label overlaps, they also resolve label-object occlusions. Vaaraniemi et al. [10] used
a force-based labeling algorithm for 2D and 3D scenes including road label placement.

Fig. 1. (a)-(b): Two ways to label the same road network. Each road section has its own
color. Junctions are marked gray. Fig. (b) identifies all road sections. (c) Illustration of
the road graph and relevant terms.


Contribution. While in grid-shaped road networks it is sufficient to place a single
label per road to clearly identify all its road sections, this is not the case in general road
networks. Consider the example in Fig. 1. In Fig. 1(a), it is not obvious whether the orange
road section in the center belongs to Knuth St. or to Turing St. Simply maximizing the
number of placed labels, as is often done for labeling point features, can cause undesired
effects like unnamed roads or clumsy label placements (e.g., around Dijkstra St. and
Hamming St. in Fig. 1(a)). Therefore, in contrast to Seibert and Unger, we aim for
maximizing the number of identified road sections, i.e., road sections that can be clearly
assigned to labels; see Fig. 1(b).

Based on criteria (C1)-(C3) we introduce a new and versatile model for road labeling
in Section 2. In Section 3 we show that the problem of maximizing the number of
identified road sections is NP-hard for general road graphs, even if each road is a path.
For the special case that the road graph is a tree, we present a polynomial-time algorithm
in Section 4. This special case is not only of theoretical interest, but our algorithm in fact
provides a very useful subroutine in exact or heuristic algorithms for labeling general
road graphs. Our initial experiments, sketched later in the paper, show that real-world road
networks decompose into small subgraphs, a large fraction of which (more than 85.1%)
are actually trees, and thus can be labeled optimally by our algorithm.


2 Preliminaries 

As argued above, a road map is a collection of fat curves in the plane, each representing 
a particular piece of a named road. If two (or more) such curves intersect, they form 
junctions. A road label is again a fat curve (the bounding shape of the road name) that is 
contained in and parallel to the fat curve representing its road. We observe that labels of 
different roads can intersect only within junctions and that the actual width of the curves
is irrelevant, except for defining the shape and size of the junctions. These observations
allow us to define the following more abstract but equivalent road map model.

A road map is a planar road graph $G = (V, E)$ together with a planar embedding
$\mathcal{E}(G)$, which can be thought of as the geometric representation of the road axes
as thin curves; see Fig. 1(c). We denote the number of vertices of $G$ by $n$, and the number
of edges by $m$. Observe that since $G$ is planar, $m = O(n)$. Each edge $e \in E$ is either a
road section, which is not part of a junction, or a junction edge, which is part of a junction.
Each vertex $v \in V$ is either a junction vertex incident only to junction edges, or a regular
vertex incident to one road section and at most one junction edge, which implies that
each regular vertex has degree at most two. A junction vertex $v$ and its incident edges
are denoted as a junction. The edge set $E$ decomposes into a set $\mathcal{R}$ of edge-disjoint
roads, where each road $R \in \mathcal{R}$ induces a connected subgraph of $G$. Without loss of
generality we assume that no two road sections in $G$ are incident to the same vertex. Thus, a
road decomposes into road sections, separated by junction vertices and their incident
junction edges. In realistic road networks the number of roads passing through
a junction is small and does not depend on the size of the road network. We therefore
assume that each vertex in $G$ has constant degree. We assume that each road
$R \in \mathcal{R}$ has a name whose length we denote by $\lambda(R)$.

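For simplicity, we identify the embedding $\mathcal{E}(G)$ with the points in the plane covered
by $\mathcal{E}(G)$, i.e., $\mathcal{E}(G) \subseteq \mathbb{R}^2$. We also use $\mathcal{E}(v)$,
$\mathcal{E}(e)$, and $\mathcal{E}(R)$ to denote the embeddings of a vertex $v$, an edge $e$, and a road $R$.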

We model a label as a simple open curve $\ell\colon [0, 1] \to \mathcal{E}(G)$ in $\mathcal{E}(G)$.
Unless mentioned otherwise, we consider a curve $\ell$ always to be simple and open, i.e., $\ell$ has no
self-intersections and its end points do not coincide. In order to ease the description, we
identify a curve $\ell$ in $\mathcal{E}(G)$ with its image, i.e., $\ell$ denotes the set
$\{\ell(t) \in \mathcal{E}(G) \mid t \in [0, 1]\}$.
The start point of $\ell$ is denoted as the head $h(\ell)$ and the endpoint as the tail $t(\ell)$. The
length of $\ell$ is denoted by $\mathrm{len}(\ell)$. The curve $\ell$ identifies a road section $r$ if
$\ell \cap \mathcal{E}(r) \ne \emptyset$.
For a set $\mathcal{L}$ of curves, $\omega(\mathcal{L})$ is the number of road sections that are identified
by the curves in $\mathcal{L}$. For a single curve $\ell$ we use $\omega(\ell)$ instead of $\omega(\{\ell\})$.
For two curves $\ell_1$ and $\ell_2$ it is not necessarily true that
$\omega(\{\ell_1, \ell_2\}) = \omega(\ell_1) + \omega(\ell_2)$, because they may identify the
same road section twice.
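The non-additivity of the weight function can be illustrated with a minimal sketch. It assumes, purely for illustration, that each curve is represented by the set of names of the road sections it intersects; the function name `omega` and the section names are hypothetical.

```python
def omega(curves):
    """Number of distinct road sections identified by a collection of curves.

    Assumption: each curve is given as a set of road-section names it intersects.
    """
    identified = set()
    for sections in curves:
        identified |= sections  # a section counts once, however often it is hit
    return len(identified)


# Two hypothetical curves that both intersect road section "r2".
l1 = frozenset({"r1", "r2"})
l2 = frozenset({"r2", "r3"})

joint = omega([l1, l2])               # counts r1, r2, r3
separate = omega([l1]) + omega([l2])  # counts r2 twice
print(joint, separate)
```

In this toy instance the joint weight is smaller than the sum of the individual weights, because the shared section is counted only once.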

A label $\ell$ for a road $R$ is a curve $\ell \subseteq \mathcal{E}(R)$ of length $\lambda(R)$ whose
endpoints must lie on road sections and not on junction edges or junction vertices. Requiring that labels end
on road sections avoids ambiguous placement of labels in junctions where it is unclear
how the road passes through it. A labeling $\mathcal{L}$ for a road map with road set $\mathcal{R}$ is a set of
mutually non-overlapping labels, where we say that two labels $\ell$ and $\ell'$ overlap if they
intersect in a point that is not their respective head or tail.

Following the cartographic quality criteria (C1)-(C3), our goal is to find a labeling $\mathcal{L}$
that maximizes the number of identified road sections, i.e., for any labeling $\mathcal{L}'$ we have
$\omega(\mathcal{L}') \le \omega(\mathcal{L})$. We call this problem MaxIdentifiedRoads.

Note that assuming the road graph G to be planar is not a restriction in practice. 
Consider for example a road section r that overpasses another road section r', i.e., r is a 
bridge over r', or r' is a tunnel underneath r. In order to avoid overlaps between labels 
placed on r and r', we either can model the intersection of r and r' as a regular crossing 
of two roads or we split r' into smaller road sections that do not cross r. In both cases
the corresponding road graph becomes planar. In the latter case we may obtain more
independent roads created by chopping r' into smaller pieces.

Fig. 2. Illustration of the NP-hardness proof. (a) 3-SAT formula $\varphi$ with four clauses
over the variables $x_1, \dots, x_5$, represented as a road graph. The truth assignment is
$x_1$ = true, $x_2$ = true, $x_3$ = false, $x_4$ = false, and $x_5$ = false. (b) Clause gadget in
two states. (c) The chain is the basic building block for the proof. (d) Schematized fork gadget.


3 Computational Complexity 

We first study the computational complexity of road labeling and prove NP-hardness of
MaxIdentifiedRoads in the following sense.

Theorem 1. For a given road map $\mathcal{M}$ and an integer $K$ it is NP-hard to decide if in
total at least $K$ road sections can be identified.

Proof. We perform a reduction from the NP-complete PLANAR MONOTONE 3-SAT
problem. An instance of PLANAR MONOTONE 3-SAT is a Boolean formula $\varphi$ with $n$
variables and $m$ clauses (disjunctions of at most three literals) that satisfies the following
additional requirements: (i) $\varphi$ is monotone, i.e., every clause contains either only positive
literals or only negative literals, and (ii) the induced variable-clause graph of $\varphi$ is
planar and can be embedded in the plane with all variable vertices on a horizontal line,
all positive clause vertices on one side of the line, all negative clauses on the other side
of the line, and the edges drawn as rectilinear curves connecting clauses and contained
variables on their respective side of the line. We construct a road map that mimics
the shape of the above embedding of the variable-clause graph by defining variable and
clause gadgets, which simulate the assignment of truth values to variables and the
evaluation of the clauses. We refer to Fig. 2 for a sketch of the construction.

Chain Gadget. The basic building block is the chain gadget, which consists of an
alternating sequence of equally long horizontal and vertical roads with identical label
lengths that intersect their respective neighbors in the sequence and form junctions with
them as indicated in Fig. 2(c). Assume that the chain consists of $k \ge 3$ roads. Then
each road except the first and last one decomposes into three road sections split by two
junctions, a longer central section and two short end sections; the first and last road
consist of only two road sections, a short one and a long one, separated by one junction.
(These two roads will later be connected to other gadgets; indicated by dotted squares in
Fig. 2(c).) The label length and the distance between junctions are chosen so that for each road
either the central and one end section are identified, or no section at all is identified. For
the first and last road, both sections are identified if the junction is covered and otherwise
only the long section can be identified. We have $k$ roads and $k - 1$ junctions. Each label
must block a junction if it identifies two sections. So the best possible configuration
blocks all junctions and identifies $2(k - 1) + 1 = 2k - 1$ road sections.

Fig. 3. Illustration of the fork gadget. (a) Structure of the fork gadget. (b) Configuration
transmitting the value false. (c) Configuration transmitting the value true.

The chain gadget has exactly two states in which $2k - 1$ road sections are identified.
Either the label of the first road does not block a junction and identifies a single section
and all subsequent roads have their label cover the junction with the preceding road in
the sequence, or the label of the last road does not block a junction and all other roads
have their label cover the junction with the successive road in the sequence. In any other
configuration there is at least one road without any identified section and thus at most
$2k - 2$ sections are identified. We use the two optimal states of the gadget to represent
and transmit the values true and false from one end to the other.
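The counting argument for the chain gadget can be checked by brute force on a purely combinatorial model. The sketch below is an illustration under stated assumptions, not part of the reduction itself: each road chooses to cover the junction with its predecessor (`"L"`), with its successor (`"R"`), or no junction (`"N"`); an inner road that covers no junction identifies nothing, while an end road still identifies its long section. The function name `chain_optima` and this encoding are hypothetical.

```python
from itertools import product


def chain_optima(k):
    """Enumerate all configurations of a chain with k >= 3 roads.

    Returns the maximum number of identified road sections and the list of
    configurations attaining it.
    """
    best, optima = -1, []
    for cfg in product("LRN", repeat=k):
        if cfg[0] == "L" or cfg[-1] == "R":
            continue  # the end roads have only one junction
        if any(cfg[i] == "R" and cfg[i + 1] == "L" for i in range(k - 1)):
            continue  # two labels would overlap in the shared junction
        score = 0
        for i, choice in enumerate(cfg):
            if choice != "N":
                score += 2  # central (or long) section plus one short section
            elif i in (0, k - 1):
                score += 1  # an end road can still identify its long section
        if score > best:
            best, optima = score, [cfg]
        elif score == best:
            optima.append(cfg)
    return best, optima


best, optima = chain_optima(5)
print(best, optima)
```

For every small `k` tried in this model, the maximum is `2k - 1` and it is attained by exactly two configurations: all labels pushed towards one end of the chain, or all towards the other.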

Fork Gadget. The fork gadget allows us to split the value represented in one chain into
two chains, which is needed to transmit the truth value of a variable into multiple clauses.
To that end it connects to an end road of three chain gadgets by sharing junctions.

The core of the fork consists of six roads $r_1, \dots, r_6$, where $r_1$, $r_2$, and $r_3$ are
vertical line segments and $r_4$, $r_5$, and $r_6$ are horizontal line segments; see Fig. 3(a). We
arrange those roads such that $r_1$ and $r_2$ each have one junction with $r_4$ and one junction
with $r_5$. Further, $r_3$ has one junction with $r_4$, one with $r_5$, and one with $r_6$. The label
length of those roads is chosen so that it is exactly the length of the roads. Hence, a
placed label identifies all road sections of its road.

Further, there are three roads $g_1$, $g_2$, $g_3$ such that $g_1$ has one junction with $r_1$, $g_2$ has
one junction with $r_2$, and $g_3$ has one junction with $r_6$. In all three cases we place the
junction so that it splits the road into a short road section that is shorter than the road's
label length and a long road section that has exactly the road's label length. We call $g_1$,
$g_2$, and $g_3$ gates, because later these roads will be connected to the end roads of chains
by junctions. To that end those connecting junctions will be placed on the long road
sections of the gates; see the violet dotted areas in Fig. 3(a).

The fork gadget has exactly two states in which 16 road sections are identified. In
the first state the labels of $r_1$, $r_2$, and $r_3$ are placed; see Fig. 3(b). Hence, the labels of $g_1$
and $g_2$ identify only the long road sections of $g_1$ and $g_2$, but not the short ones. The label
of $g_3$ identifies both the long and the short road section of $g_3$. In the second state the labels of
$r_4$, $r_5$, and $r_6$ are placed; see Fig. 3(c). Hence, the labels of $g_1$ and $g_2$ identify the long and
short road sections of $g_1$ and $g_2$, while only the long road section of $g_3$ is identified by a
label. In any other configuration fewer road sections are identified by labels. We use the
two optimal states of the gadget to represent and transmit the values true and false from
one gate to the other two gates. More specifically, the gates $g_1$ and $g_2$ are connected with
chains that lead to the same literal, while $g_3$ is connected with a chain that leads to the
complementary literal.
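The count of 16 identified road sections in both fork states can be re-derived with a small bookkeeping sketch. It rests on assumptions that are not spelled out in the text: every junction lies in the interior of its road (so a road with j junctions has j + 1 sections), the vertical core roads are r1, r2, r3, the horizontal ones r4, r5, r6, and the third gate attaches to r6. The names `CORE_JUNCTIONS`, `GATE_PARTNER`, and `identified_sections` are hypothetical.

```python
# Junction partners of each core road (assumption: all junctions are interior).
CORE_JUNCTIONS = {
    "r1": ["r4", "r5", "g1"],
    "r2": ["r4", "r5", "g2"],
    "r3": ["r4", "r5", "r6"],
    "r4": ["r1", "r2", "r3"],
    "r5": ["r1", "r2", "r3"],
    "r6": ["r3", "g3"],
}
# Core road that each gate shares a junction with (assumed attachment).
GATE_PARTNER = {"g1": "r1", "g2": "r2", "g3": "r6"}


def identified_sections(labeled_core):
    """Count identified sections when exactly the given core roads carry a label."""
    labeled = set(labeled_core)
    # Labels of two crossing core roads would overlap in their junction.
    for road in labeled:
        assert labeled.isdisjoint(CORE_JUNCTIONS[road]), "labels overlap"
    # A core label spans its whole road: (number of junctions + 1) sections.
    total = sum(len(CORE_JUNCTIONS[road]) + 1 for road in labeled)
    for partner in GATE_PARTNER.values():
        # A gate whose junction is blocked identifies only its long section.
        total += 1 if partner in labeled else 2
    return total


first_state = identified_sections(["r1", "r2", "r3"])   # vertical roads labeled
second_state = identified_sections(["r4", "r5", "r6"])  # horizontal roads labeled
print(first_state, second_state)
```

Under these assumptions both states evaluate to 16, which agrees with the number stated for the two optimal states of the gadget.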

Variable Gadget. We define the variable gadgets simply by connecting chain and fork
gadgets into a connected component of intersecting roads. This construction already has
the functionality of a variable gadget: it represents (in a labeling identifying the maximum
number of road sections) the same truth value in all of its branches, synchronized by the
fork gadgets; see the blue chains and yellow forks in Fig. 2(a). More precisely, we place a
sequence of chains linked by fork gadgets along the horizontal line on which the variable
vertices are placed in the drawing. Each fork creates a branch of the variable gadget
either above or below the line. We create as many branches above (below) the line as
the variable has occurrences in positive (negative) clauses in $\varphi$. The first and last chain
on the line also serve as branches. The synchronization of the different branches via the
forks is such that either all top branches have their road labels pushed away from the
line and all bottom branches pulled towards the line or vice versa. In the first case, we
say that the variable is in the state false and in the latter case that it is in the state true.
The example in Fig. 2(a) has two variables set to true and three variables set to false.

Clause Gadget. Finally, we need to create the clause gadget, which links three
branches of different variables. The core of the gadget is a single road that consists
of three sub-paths meeting in one junction. Each sub-path of that road shares another
junction with one of the three incoming variable branches. Beyond each of these three
junctions the final road sections are just long enough so that a label can be placed on
the section. However, the section between the central junction of the clause road and each of the
junctions with the literal roads is shorter than the label length. The road of the clause
gadget has six sections in total and we argue that the six sections can only be identified
if at least one incoming literal evaluates to true. Otherwise at most five sections can be
identified. By construction, each road in the chain of a false literal has its label pushed
towards the clause, i.e., it blocks the junction with the clause road. As long as at least
one of these three junctions is not blocked, all sections can be identified; see Fig. 2(b).
But if all three junctions are blocked, then only two of the three inner sections of the
clause road can be identified and the third one remains unlabeled; see Fig. 2(b).

Reduction. Obviously, the size of the instance is polynomial in $n$ and $m$. If we
have a satisfying variable assignment for $\varphi$, we can construct the corresponding road
labeling and the number of identified road sections is six per clause and a fixed constant
number $K'$ of sections in the variable gadgets, i.e., at least $K = K' + 6m$. On the other
hand, if we have a road labeling with at least $K$ identified sections, each variable gadget
is in one of its two maximum configurations and each clause road has at least one label
that covers a junction with a literal road, meaning that the corresponding truth value
assignment of the variables is indeed a satisfying one. This concludes the reduction.


Since MaxIdentifiedRoads is an optimization problem, we only present the NP-hardness
proof. Still, one can argue that the corresponding decision problem is NP-complete
by guessing which junctions are covered by which label and then using linear
programming for computing the label positions. We omit the technical details. Further,
most roads in the reduction are paths, except for the central road in each clause gadget,
which is a degree-3 star. In fact, we can strengthen Theorem 1 by using a more complex
clause gadget instead that uses only paths; see Appendix A.1.


4 An Efficient Algorithm for Tree-Shaped Road Maps 


In this section we assume that the underlying road graph of the road map is a tree
$T = (V, E)$. In Section 4.1 we present a polynomial-time algorithm to optimally solve
MaxIdentifiedRoads for trees; Section 4.2 shows how to improve its running time and
space consumption. Our approach uses the basic idea that removing the vertices whose
embeddings lie in a curve $c \subseteq \mathcal{E}(T)$ splits the tree into independent parts. In
particular this is true for labels. We assume that $T$ is rooted at an arbitrary leaf $\rho$ and
that its edges are directed away from $\rho$; see Fig. 4. For two points $p, q \in \mathcal{E}(T)$ we
define $d(p, q)$ as the length of the shortest curve in $\mathcal{E}(T)$ that connects $p$ and $q$. For
two vertices $u$ and $v$ of $T$ we also write $d(u, v)$ instead of $d(\mathcal{E}(u), \mathcal{E}(v))$. For a
point $p \in \mathcal{E}(T)$ we abbreviate the distance $d(p, \rho)$ to the root $\rho$ by $d_p$. For a
curve $\ell$ in $\mathcal{E}(T)$, we call $p \in \ell$ the lowest point of $\ell$ if $d_p \le d_q$ for any
$q \in \ell$. As $T$ is a tree, $p$ is unique. We distinguish two types of curves in $\mathcal{E}(T)$. A
curve $\ell$ is vertical if $h(\ell)$ or $t(\ell)$ is the lowest point of $\ell$; otherwise we call $\ell$
horizontal (see Fig. 4). Without loss of generality we assume that the lowest point of each vertical
curve $\ell$ is its tail $t(\ell)$. Since labels are modeled as curves, they are also either vertical or
horizontal. For a vertex $u \in V$ let $T_u$ denote the subtree rooted at $u$.

Fig. 4. Basic definitions.
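The distinction between vertical and horizontal curves can be sketched on a rooted tree given by parent pointers. The sketch simplifies the setting by assuming that both endpoints of a curve are vertices; the lowest point of the curve is then the endpoint pair's lowest common ancestor. The function names and the toy tree are hypothetical.

```python
def path_to_root(parent, v):
    """Vertices on the path from v up to the root (root has parent None)."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def lowest_vertex(parent, a, b):
    """Vertex closest to the root on the tree path between a and b."""
    ancestors_a = set(path_to_root(parent, a))
    for v in path_to_root(parent, b):
        if v in ancestors_a:
            return v  # first common ancestor met when walking up from b
    raise ValueError("vertices are not in the same tree")


def classify(parent, head, tail):
    """A curve is vertical if one of its endpoints is its lowest point."""
    low = lowest_vertex(parent, head, tail)
    return "vertical" if low in (head, tail) else "horizontal"


# Toy tree rooted at "rho": rho - a, a - b, a - c, b - d.
parent = {"rho": None, "a": "rho", "b": "a", "c": "a", "d": "b"}
print(classify(parent, "d", "a"))  # path d-b-a has its lowest point at endpoint a
print(classify(parent, "d", "c"))  # path d-b-a-c has its lowest point at interior a
```

The first curve is vertical and the second is horizontal, because in the second case the lowest point is an interior vertex of the curve.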


4.1 Basic Approach 

We first determine a finite set of candidate positions for the heads and tails of labels,
and transform $T$ into a tree $T' = (V', E')$ by subdividing some of $T$'s edges so that it
contains a vertex for every candidate position. To that end we construct for each regular
vertex $u \in V$ a chain of tightly packed vertical labels that starts at $\mathcal{E}(u)$, is directed
towards $\rho$, and ends when either the road ends or adding the next label does not increase
the number of identified road sections. More specifically, we place a first vertical label $\ell_1$
such that $h(\ell_1) = \mathcal{E}(u)$. For $i = 2, 3, \dots$ we add a new vertical label $\ell_i$ with
$h(\ell_i) = t(\ell_{i-1})$ as long as $h(\ell_i)$ and $t(\ell_i)$ do not lie on the same road section and
none of $\ell_i$'s endpoints lie on a junction edge. We use the tails of all those labels to subdivide the
tree $T$. Doing this for all regular vertices of $T$ we obtain the tree $T'$, which we call the
subdivision tree of $T$. The vertices in $V' \setminus V$ are neither junction vertices nor regular
vertices. Since each chain consists of $O(n)$ labels, the cardinality of $V'$ is $O(n^2)$. We
call an optimal labeling $\mathcal{L}$ of $T$ a canonical labeling if for each label $\ell \in \mathcal{L}$ there
exists a vertex $v$ in $T'$ with $\mathcal{E}(v) = h(\ell)$ or $\mathcal{E}(v) = t(\ell)$. The next lemma proves
that it is sufficient to consider canonical labelings.

Lemma 1. For any road graph $T$ that is a tree, there exists a canonical labeling $\mathcal{L}$.

Proof. Let $\mathcal{L}$ be an optimal labeling of $T$. We push the labels of $\mathcal{L}$ as far as possible
towards the leaves of $T$ without changing the identified road sections; see Fig. 5. More
specifically, starting with the labels closest to the leaves, we move each label away from
the root as far as possible while its head and tail must remain on their respective road
sections. For a vertical label this direction is unique, while for horizontal labels we can
choose any of the two. Then, for each label its head or tail either coincides with a leaf of
$T$, with some internal regular vertex, or with the head of another label. Consequently,
each vertical label belongs to a chain of tightly packed vertical labels starting at a regular
vertex $v \in V$. Further, the head or tail of each horizontal label coincides with the end
of a chain of tightly packed vertical labels or a regular vertex of $T$, which proves the
claim. □

We now explain how to construct such a canonical labeling. To that end we first
introduce some notation. For a vertex $u \in V'$ let $\mathcal{L}(u)$ denote a labeling that identifies
a maximum number of road sections in $T$ only using valid labels in $\mathcal{E}(T'_u)$, where
$T'_u$ denotes the subtree of $T'$ rooted at $u$. Note that those labels also may identify the
incoming road section of $u$ (in the accompanying figure, label $\ell$ identifies the edge $e'$).

Further, the children of a vertex $u \in V'$ are denoted by the set $N(u)$; we explicitly
exclude the parent of $u$ from $N(u)$. Further, consider an arbitrary curve $\ell$ in $\mathcal{E}(T)$ and
let $\ell' = \ell \setminus \{t(\ell), h(\ell)\}$. We observe that removing all vertices of $T'$ contained in
$\ell'$ together with their incident outgoing edges creates several independent subtrees. We call
the roots of these subtrees (except the one containing $\rho$) children of $\ell$ (see Fig. 4). If no
vertex of $T'$ lies in $\ell'$, the curve is contained in a single edge $(u, v) \in E'$. In that case $v$
is the only child of $\ell$. We denote the set of all children of $\ell$ as $N(\ell)$.

For each vertex $u$ in $T'$ we introduce a set $C(u)$ of candidates, which model potential
labels with lowest point $\mathcal{E}(u)$. If $u$ is a regular vertex of $T$ or $u \in V' \setminus V$, the set
$C(u)$ contains all vertical labels $\ell$ with lowest point $\mathcal{E}(u)$. If $u$ is a junction vertex,
$C(u)$ contains all horizontal labels that start or end at a vertex of $T'$ and whose lowest point
is $\mathcal{E}(u)$. In both cases we assume that $C(u)$ also contains the degenerate curve
$\bot_u = \mathcal{E}(u)$, which is the dummy label of $u$. We set $N(\bot_u) = N(u)$ and
$\omega(\bot_u) = 0$.

Fig. 5. Canonical labeling.

For a curve $\ell$ we define $\mathcal{L}(\ell) = \bigcup_{v \in N(\ell)} \mathcal{L}(v) \cup \{\ell\}$. Thus,
$\mathcal{L}(\ell)$ is a labeling comprising $\ell$ and the labels of its children's optimal labelings.
We call a label $\ell \in C(u)$ with $\ell = \arg\max\{\omega(\mathcal{L}(\ell')) \mid \ell' \in C(u)\}$ an
optimal candidate of $u$. Next, we prove that it is sufficient to consider optimal candidates
to construct a canonical labeling.

Lemma 2. Let $u$ be a vertex of $T'$ with an optimal labeling $\mathcal{L}(u)$, and let $\ell$ be an optimal
candidate of $u$. Then $\omega(\mathcal{L}(u)) = \omega(\mathcal{L}(\ell))$.

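Proof. First note that $\omega(\mathcal{L}(u)) \ge \omega(\mathcal{L}(\ell))$ because both labelings $\mathcal{L}(u)$ and
$\mathcal{L}(\ell)$ only contain labels that are embedded in $\mathcal{E}(T'_u)$. By Lemma 1 we can assume
without loss of generality that $\mathcal{L}(u)$ is a canonical labeling. Let $\hat{\ell}$ be the label of
$\mathcal{L}(u)$ with $\mathcal{E}(u)$ as the lowest point of $\hat{\ell}$ (if it exists).

If $\hat{\ell}$ exists, then the vertices in $N(\hat{\ell})$ are roots of independent subtrees, which directly
yields $\omega(\mathcal{L}(u)) = \omega(\mathcal{L}(\hat{\ell}))$. By construction of $C(u)$ we further know that
$\hat{\ell}$ is contained in $C(u)$. Hence, $\hat{\ell}$ is an optimal candidate of $u$, which implies
$\omega(\mathcal{L}(\hat{\ell})) = \omega(\mathcal{L}(\ell))$.

If $\hat{\ell}$ does not exist, then we have

\[
\omega(\mathcal{L}(u))
  = \omega\Bigl(\bigcup_{v \in N(u)} \mathcal{L}(v)\Bigr)
  \overset{(1)}{=} \omega\Bigl(\bigcup_{v \in N(\bot_u)} \mathcal{L}(v) \cup \{\bot_u\}\Bigr)
  = \omega(\mathcal{L}(\bot_u)).
\]

Equality (1) follows from $N(\bot_u) = N(u)$ and the definition that $\bot_u$ does not identify
any road section. Since $\bot_u$ is contained in $C(u)$, the dummy label $\bot_u$ is the optimal
candidate $\ell$. □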

Algorithm 1 first constructs the subdivision tree $T' = (V', E')$ from $T$. Then,
starting with the leaves of $T'$ and going to the root $\rho$ of $T'$, it computes an optimal
candidate $\ell = \mathrm{OptCandidate}(u)$ for each vertex $u \in V'$ in a bottom-up fashion. By
Lemma 2 the labeling $\mathcal{L}(\ell)$ is an optimal labeling of $T'_u$. In particular $\mathcal{L}(\rho)$ is the
optimal labeling of $T$.

Due to the size of the subdivision tree $T'$ we consider $O(n^2)$ vertices. Implementing
OptCandidate(u), which computes an optimal candidate $\ell$ for $u$, naively creates $C(u)$
explicitly. We observe that if $u$ is a junction vertex, $C(u)$ may contain $O(n^2)$
labels; $O(n^2)$ pairs of road sections of different subtrees of $u$ can be connected by
horizontal labels. Each label can be constructed in $O(n)$ time using a breadth-first search.
Thus, for each vertex $u$ the procedure OptCandidate needs $O(n^3)$ time in a naive
implementation, which yields $O(n^5)$ running time in total. Further, we need $O(n^2)$ storage
to store $T'$. Note that we do not need to store $\mathcal{L}(u)$ for each vertex $u$ of $T'$, but by
Lemma 2 we can reconstruct it using $\mathcal{L}(\ell)$, where $\ell$ is the optimal candidate of $u$. To that
end we store for each vertex of $T'$ its optimal candidate $\ell$ and $\omega(\mathcal{L}(\ell))$.

Algorithm 1: Computing an optimal labeling of T.

Input: Road graph T, where T is a tree with root ρ.

Output: Optimal labeling L(ρ) of T.

T' ← compute subdivision tree of T
for each leaf v of T' do
    L(v) ← ∅
for each vertex u of T' considered in a bottom-up traversal of T' do
    L(u) ← L(OptCandidate(u))
return L(ρ)
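The bottom-up scheme of Algorithm 1 can be sketched as a small dynamic program. This is a simplified illustration, not the paper's implementation: the subdivision tree and the candidate sets are assumed to be given explicitly, and each candidate carries a precomputed `gain` that is assumed to be additive with the values of its children (which sidesteps the double counting of road sections). All names (`optimal_labeling`, the `"dummy"` marker, the toy instance) are hypothetical.

```python
def optimal_labeling(children, candidates, root):
    """Bottom-up choice of an optimal candidate per vertex.

    children[u]   -- list of child vertices N(u) in the subdivision tree
    candidates[u] -- list of (label_name, gain, N(label)) triples, where
                     N(label) lists the children of the label (assumed input)
    """
    # Pre-order via a stack; reversing it visits descendants before ancestors.
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, []))

    value, choice = {}, {}
    for u in reversed(order):
        # The dummy label identifies nothing and has the children N(u).
        options = [("dummy", 0, children.get(u, []))] + candidates.get(u, [])
        best = max(options, key=lambda c: c[1] + sum(value[v] for v in c[2]))
        choice[u] = best
        value[u] = best[1] + sum(value[v] for v in best[2])

    def collect(u):
        """Reconstruct the labeling from the stored optimal candidates."""
        name, _, nxt = choice[u]
        labels = [] if name == "dummy" else [name]
        for v in nxt:
            labels += collect(v)
        return labels

    return value[root], collect(root)


# Toy instance: a path rho - a - b - c with two competing labels.
children = {"rho": ["a"], "a": ["b"], "b": ["c"], "c": []}
candidates = {
    "a": [("l1", 2, ["c"])],  # spans a..c, identifies two sections
    "b": [("l2", 1, [])],     # spans b..c, identifies one section
}
total, labels = optimal_labeling(children, candidates, "rho")
print(total, labels)
```

Only the optimal candidate and its value are stored per vertex, mirroring the remark that the labeling itself can be reconstructed from the stored candidates.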

Theorem 2. For a road map with a tree as underlying road graph, MaxIdentifiedRoads
can be solved in $O(n^5)$ time using $O(n^2)$ space.

In case that all roads are paths, Algorithm 1 runs in $O(n^4)$ time, because for each
$u \in V'$ the set $C(u)$ contains $O(n)$ labels. Further, besides the primary objective
to identify a maximum number of road sections, Chirie also suggested several
additional secondary objectives, e.g., labels should overlap as few junctions as possible.
Our approach allows us to easily incorporate secondary objectives by changing the
weight function $\omega$ appropriately.


4.2 Improvements on Running Time 

In this part we describe how the running time of Algorithm 1 can be improved to $O(n^3)$
time by speeding up OptCandidate(u) to $O(n)$ time.

For an edge $e = (u, v) \in E \cup E'$ we call a vertical curve $\ell \subseteq \mathcal{E}(T)$ an $e$-rooted
curve if $t(\ell) = \mathcal{E}(u)$, $h(\ell)$ lies on a road section, and
$\mathrm{len}(\mathcal{E}(e) \cap \ell) = \min\{\mathrm{len}(\ell), \mathrm{len}(\mathcal{E}(e))\}$,
i.e., $\ell$ emanates from $\mathcal{E}(u)$ passing through $e$; for example, the red label in the
accompanying figure is an $e$-rooted curve. An $e$-rooted curve $\ell$ is maximal if there is no other
$e$-rooted curve $\ell'$ with $\mathrm{len}(\ell) = \mathrm{len}(\ell')$ and
$\omega(\mathcal{L}(\ell')) > \omega(\mathcal{L}(\ell))$. We observe that in any
canonical labeling each vertical label $\ell$ is a $(u, v)$-rooted curve with $(u, v) \in E'$, and
each horizontal label $\ell$ can be composed of a $(u, v_1)$-rooted curve $\ell_1$ and a
$(u, v_2)$-rooted curve $\ell_2$ with $(u, v_1), (u, v_2) \in E'$, where $\mathcal{E}(u)$ is the lowest
point of $\ell$; see the accompanying figures. Further, for a vertical curve $c$ in $\mathcal{E}(T)$ its
distance interval $I(c)$ is $[d_{t(c)}, d_{h(c)}]$. Since $T$ is a tree, for every point $p$ of $c$ we have
$d_p \in I(c)$. Two vertical curves $c$ and $c'$ superpose each other if
$I(c) \cap I(c') \ne \emptyset$; see Fig. 6.

Next, we introduce a data structure that encodes for each edge $(u, v)$ of $T$
all maximal $(u, v)$-rooted curves as $O(n)$ superposition-free curves in $\mathcal{E}(T_u)$. In
particular, each of those curves lies on a single road section such that all $(u, v)$-rooted
curves ending on that curve are maximal and identify the same number of road sections.
We define this data structure as follows.

Fig. 6. Superposing curves, e.g., $c_1$ and $c_2$ superpose each other, while $c_1$ and $c_5$ do not.
The tree is annotated with distance marks.


Definition 1 (Linearization). Let $e = (u, v)$ be an edge of $T$. A tuple $(L, \omega)$ is called
a linearization of $e$ if $L$ is a set of superposition-free curves and $\omega\colon L \to \mathbb{N}$ such that
(1) for each curve $c \in L$ there is a road section $e'$ in $T_u$ with $c \subseteq \mathcal{E}(e')$,

(2) for each $e$-rooted curve $\ell$ there is a curve $c \in L$ with $\mathrm{len}(\ell) + d_u \in I(c)$,

(3) for each point $p$ of each curve $c \in L$ there is a maximal $e$-rooted curve $\ell$ with
$h(\ell) = p$ and $\omega(c) = \omega(\mathcal{L}(\ell))$.

Assume that we apply Algorithm 1 on $T'$ and that we currently consider the vertex
$u$ of $T'$. Hence, we can assume that for each vertex $v \ne u$ of $T'_u$ its optimal candidate
and $\omega(\mathcal{L}(v))$ are given. We first explain how to speed up OptCandidate using
linearizations. Afterwards, we present the construction of linearizations.




Application of linearizations. Here we assume that the linearizations are given for the
edges of $T$. Depending on the type of $u$, we describe how to compute its optimal candidate.

Case 1, u is regular. If it is a leaf, the set C{u) contains only _L„. Hence, assume 
that u has one outgoing edge e = (it, v) G E', which belongs to a road R. Let P be the 
longest path of vertices in that starts at it and does not contain any junction vertex. 
Note that the path must be unique. Further, by construction of T' the last vertex w of 
P must be a regular vertex in V, but not in V' \ V. We consider two cases; see Fig[7j 

If d(it, lu) > A(i?), the optimal candidate is either _L„ or 
the e-rooted curve £ of length A(i?) that ends on E(P). By 
assumption and due to a;(£(_L„)) = a; (£(!>)), we decide in 
0(1) time whether tLi(£(_Lu)) > !jj{C{£)), obtaining the optimal 
candidate. 

If $d(u, w) < \lambda(R)$, the optimal candidate is either $\bot_u$ or
goes through a junction. Since $w$ is regular, it has only one
outgoing edge $e' = (w, x)$. Further, by the choice of $P$ the
edge $e'$ is a junction edge in $T$; therefore the linearization $(L, \omega)$
of $e'$ is given. In linear time we search for the curve $c \in L$ such
that there is an $e$-rooted curve $\ell$ of length $\lambda(R)$ with its head
on $c$. To that end we consider for each curve $c \in L$ its distance
interval $I(c)$ and check whether there is $t \in I(c)$ with $t - d_u = \lambda(R)$. Note that using a
binary search tree for finding $c$ speeds this procedure up to $O(\log n)$ time; however, this
does not asymptotically improve the total running time. The $e$-rooted curve $\ell$ then can
be easily constructed in $O(n)$ time by walking from $c$ to $u$ in $E(T)$.

If such a curve $c$ exists, by definition of a linearization the optimal candidate is
either $\bot_u$ or $\ell$, which we can decide in $O(1)$ time by checking
$\omega(\mathcal{L}(\bot_u)) > \omega(\mathcal{L}(\ell))$.
Note that we have $\omega(\mathcal{L}(\bot_u)) = \omega(\mathcal{L}(v))$ and $\omega(\mathcal{L}(\ell)) = \omega(c)$.
If $c$ does not exist, again by definition of a linearization there is
no vertical label $\ell \in C(u)$ and $\bot_u$ is the optimal candidate.
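
The lookup used in Case 1 can be sketched as follows. This is only an illustration
under assumptions that are not part of the paper: a linearization is stored as a list of
(lo, hi, weight) triples with pairwise disjoint distance intervals, sorted by lo, and the
function names find_curve and regular_vertex_candidate are hypothetical. The sketch
uses the binary search mentioned above instead of the linear scan, and it prefers the
dummy label on ties.

```python
from bisect import bisect_right


def find_curve(linearization, target):
    """Return the (lo, hi, weight) triple whose interval contains target, or None.

    Assumes the triples are sorted by lo and that their intervals are disjoint.
    """
    los = [lo for lo, _, _ in linearization]
    k = bisect_right(los, target) - 1  # last interval starting at or before target
    if k >= 0 and linearization[k][0] <= target <= linearization[k][1]:
        return linearization[k]
    return None


def regular_vertex_candidate(linearization, d_u, label_len, weight_dummy):
    """Choose between the dummy label and a vertical label of length label_len.

    weight_dummy stands for the weight of the labeling obtained with the dummy
    label; the weight stored at a curve stands for the weight obtained with the
    label whose head lies on that curve.
    """
    curve = find_curve(linearization, d_u + label_len)
    if curve is None or weight_dummy >= curve[2]:
        return ("dummy", weight_dummy)
    return ("label", curve[2])
```

For a head position at distance $d_u + \lambda(R)$ the sketch either reports the curve
that carries the head or falls back to the dummy label.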

Case 2, $u$ is a junction vertex. The set $C(u)$ contains horizontal
labels. Let $\ell$ be such a label and let $e_1 = (u, v_1)$
and $e_2 = (u, v_2)$ be two junction edges in $E$ covered by $\ell$;
see Fig. 8. Then there is an $e_1$-rooted curve $\ell_1$ and an $e_2$-rooted
curve $\ell_2$ whose composition is $\ell$. Further, we have
$$\omega(\mathcal{L}(\ell)) = \omega(\mathcal{L}(\ell_1) \cup \mathcal{L}(\ell_2)) + \sum_{v \in N(u) \setminus \{v_1, v_2\}} \omega(\mathcal{L}(v)).$$
We use this as follows.

Let $e_1$ and $e_2$ be two outgoing edges of $u$ that belong to the same road $R$, and let
$(L_1, \omega_1)$ and $(L_2, \omega_2)$ be the linearizations of $e_1$ and $e_2$, respectively. We define
for $e_1$ and $e_2$ and their linearizations the operation opt-cand$(L_1, L_2)$ that finds an optimal
candidate of $u$ restricted to labels identifying $e_1$ and $e_2$.

Fig. 7. Case 1

Fig. 8. Case 2

For $i = 1, 2$ let $d_i = \max\{d_v \mid v \text{ is a vertex of } T_{v_i}\}$ and let
$f_u(t) = d_u - (t - d_u) = 2d_u - t$ be the function that "mirrors" the point $t \in \mathbb{R}$ at $d_u$.

Applying $f_u(t)$ on the boundaries of the distance intervals of the curves in $L_1$, we
first mirror these intervals such that they are contained in the interval
$[2d_u - d_1, d_u]$; see Fig. 9. Thus, the curves in $L_1 \cup L_2$ are mutually
superposition-free such that their distance intervals lie in $J = [2d_u - d_1, d_2]$.
We call an interval $[x, y] \subseteq J$ a window, if it has length $\lambda(R)$, $d_u \in [x, y]$
and there are curves $c_1 \in L_1$ and $c_2 \in L_2$ with $x \in I(c_1)$ and $y \in I(c_2)$;
see Fig. 9. By the definition of a linearization there is a maximal $e_1$-rooted
curve $\ell_1$ ending on $c_1$ and a maximal $e_2$-rooted curve $\ell_2$ ending on $c_2$ such
that $\mathrm{len}(\ell_1) + \mathrm{len}(\ell_2) = \lambda(R)$. Consequently, the composition of $\ell_1$ and $\ell_2$
forms a horizontal label $\ell$ with
$$\omega(\mathcal{L}(\ell)) = \omega(\mathcal{L}(\ell_1) \cup \mathcal{L}(\ell_2)) + \sum_{v \in N(u) \setminus \{v_1, v_2\}} \omega(\mathcal{L}(v)),$$
which we call the value of the window.

Using a simple sweep from left to right we compute for the distance interval $I(c)$ of
each curve $c \in L_1 \cup L_2$ a window $[x, y]$ that starts or ends in $I(c)$ (if such a window
exists). The result of opt-cand$(L_1, L_2)$ is then the label $\ell$ of the window with maximum
value. For each pair $e_1$ and $e_2$ of outgoing edges we apply opt-cand$(L_1, L_2)$ computing
a label $\ell$. By construction either the label $\ell$ with maximum $\omega(\mathcal{L}(\ell))$ or $\bot_u$ is the optimal
candidate for $u$, which we can check in $O(1)$ time. Later on we prove that we consider
only linearizations of linear size. Since each vertex of $T'$ has constant degree, we obtain
the next lemma.



Fig. 9. Constructing the optimal candidate of $u$
based on the linearizations $(L_1, \omega_1)$ and $(L_2, \omega_2)$.
The tree is annotated with distance marks.


Lemma 3. For each $u \in V'$ the optimal candidate can be found in $O(n)$ time.
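
The window search behind opt-cand can be sketched as follows. The sketch rests on
assumptions that go beyond the text: linearizations are lists of (lo, hi, weight) triples
with disjoint distance intervals that all lie at or beyond $d_u$, the value of a window is
taken to be the sum of the two curve weights plus a constant rest for the remaining
neighbours of $u$, and the names mirror and opt_cand are hypothetical. Instead of an
explicit sweep it shifts the mirrored intervals by the label length and intersects the two
sorted interval lists with two pointers.

```python
def mirror(intervals, d_u):
    """Mirror (lo, hi, weight) triples at d_u via t -> 2*d_u - t; result is sorted."""
    return sorted((2 * d_u - hi, 2 * d_u - lo, w) for lo, hi, w in intervals)


def opt_cand(lin1, lin2, d_u, label_len, rest=0):
    """Return (value, x) for a best window [x, x + label_len], or None.

    x lies on a mirrored curve of lin1 and x + label_len on a curve of lin2.
    """
    # Shifting a mirrored interval by label_len gives the possible right ends y.
    left = [(lo + label_len, hi + label_len, w) for lo, hi, w in mirror(lin1, d_u)]
    right = sorted(lin2)
    best = None
    i = j = 0
    while i < len(left) and j < len(right):
        lo = max(left[i][0], right[j][0])
        hi = min(left[i][1], right[j][1])
        if lo <= hi:  # a common right end y exists, so a window exists
            value = left[i][2] + right[j][2] + rest
            if best is None or value > best[0]:
                best = (value, lo - label_len)
        # Advance the list whose current interval ends first.
        if left[i][1] < right[j][1]:
            i += 1
        else:
            j += 1
    return best
```

The two-pointer loop visits every interval once, which matches the linear bound of
Lemma 3 when the inputs are already sorted; the calls to sorted are there only to keep
the sketch self-contained.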


Construction of linearizations. We now show how to recursively construct a linearization
for an edge $e = (u, v)$ of $T$. To that end we assume that we are given the subdivision
tree $T'$ of $T$ and the linearizations for the outgoing edges $e_1 = (v, w_1), \dots, e_k = (v, w_k)$
of $v$ that belong to the same road $R$ as $e$. Further, we can assume that we have computed
the weight $\omega(\mathcal{L}(w))$ for all vertices $w$ of $T'_u$ excluding $u$. In case that two of
those vertices share the same position in the embedding, we remove the one with less weight.
Let $T_i$ be the tree induced by the edges $e$, $e_i$ and the edges of the subtree rooted at $w_i$.
As a first step we compute for each linearization $(L, \omega)$ of each edge $e_i$ a linearization
for $e$ restricted to the tree $T_i$, i.e., conceptually, we assume that the tree only consists of
$T_i$'s edges.
Fig. 10. 1st Step: For each edge $e_i$ extend its linearization $(L, \omega)$ to a
linearization of $T_i$.

Fig. 11. 2nd Step: Merging the linearizations of the trees $T_i$ and $T_j$.


If $e$ is a junction edge we set $L_i \leftarrow L$ and weight each curve $c \in L_i$ as follows.

$$\omega_i(c) \leftarrow \omega(c) + \sum_{w \in N(v) \setminus \{w_i\}} \omega(\mathcal{L}(w))$$


Otherwise, if $e$ is a road section, let $v_1, \dots, v_l$ be the vertices of the subdivision
tree $T'$ that lie on $e$, i.e., $E(v_j) \in E(e)$ for all $1 \le j \le l$; see Fig. 10. We assume that
$d(v_1) < \dots < d(v_l)$, which in particular yields $v_1 = u$ and $v_l = v$. Let $c_1$ be the
curve $E((v_1, v_2))$ and for $2 \le j < l$ let $c_j$ be the curve $E((v_j, v_{j+1})) \setminus E(v_j)$. Hence,
we have $\bigcup_{j=1}^{l-1} c_j = E(e)$ and $c_j \cap c_{j'} = \emptyset$ for $1 \le j < j' < l$. We set

$$L_i \leftarrow L \cup \bigcup_{j=1}^{l-1} \{c_j\}$$

We weight each curve $c \in L_i$ as follows. If $c$ is contained in $L$, we set

$$\omega_i(c) \leftarrow \omega(c) + 1$$

Otherwise, $c$ is a sub-curve of $E(e)$ and there exists a $j$ with $c = c_j$. We set

$$\omega_i(c) \leftarrow \omega(\mathcal{L}(v_{j+1}) \cup \{\ell_c\}),$$

where $\ell_c \subseteq E(e)$ is an $e$-rooted curve that starts at $E(u)$ and ends on $c$. The next lemma
shows that this transformation yields a linearization as desired.

Lemma 4. For each outgoing edge $e_i$ with linearization $(L, \omega)$ the tuple $(L_i, \omega_i)$ is a
linearization of $e$ restricted to the tree $T_i$.

Proof. We use the same notation as used above. 

First of all, the set $L_i$ contains only curves that do not superpose each other: Since
$L$ contains only curves that do not superpose each other, the only curves that could
superpose another curve in $L_i$ are contained in $L_i \setminus L$. Since $L_i \setminus L$ is empty for a
junction edge, we can assume that $e$ is a road section. By construction those curves in
$L_i \setminus L$ partition $E(e)$ without intersecting each other. Further, by assumption no two road
sections share a common vertex and since all curves of $L$ are contained in $E(T_v)$, the
curves in $L_i \setminus L$ cannot superpose any curve in $L$.

We now prove that $L_i$ satisfies the three conditions of a linearization. First assume
that $e$ is a road section.

Condition (1). Since $L$ is a linearization, each curve of $L$ must be a sub-curve of a
road section. Further, the curves of $L_i \setminus L$ are sub-curves of the road section $e$.

Condition (2). First consider an $e$-rooted curve $\ell$ that either ends on $e_i$ or on an edge
of $T_{w_i}$. Recall that $h(\ell)$ must lie on a road section. Then there is an $e_i$-rooted curve $\ell'$
with $\ell' \subseteq \ell$ and $h(\ell) = h(\ell')$. Hence, there is a curve $c \in L$ with $\mathrm{len}(\ell') + d_v \in I(c)$.
Since $\ell'$ is a sub-curve of $\ell$, we also have $\mathrm{len}(\ell) + d_u \in I(c)$. Now, consider an $e$-rooted
curve $\ell$ that ends on $e$; then by construction there is a curve $c \in L_i \setminus L$ with
$\mathrm{len}(\ell) + d_u \in I(c)$.

Condition (3). First consider an arbitrary curve $c \in L_i \setminus L$ and let $\ell$ be any $e$-rooted
curve that ends on $c$. Further, let $v_1, \dots, v_l$ be the vertices of the subdivision tree $T'$ that
lie on $e$ as defined above. By construction there is an edge $(v_j, v_{j+1})$ with $1 \le j < l$
and $c \subseteq E((v_j, v_{j+1}))$. It holds

$$\omega(\mathcal{L}(\ell)) = \omega(\mathcal{L}(v_{j+1}) \cup \{\ell\}) = \omega_i(c)$$

Obviously, $\ell$ must be maximal, because there is no other point in $E(T_i)$ having the same
distance to $\rho$ as $h(\ell)$ has.

Finally, consider a curve $c \in L$ and let $\ell$ be any $e$-rooted curve that ends on $c$.
As $L$ is a linearization of $e_i$, for each point $p$ on $c$ there must be an $e_i$-rooted curve $\ell'$
with $h(\ell') \in c$. We choose $\ell'$ such that $h(\ell') = h(\ell)$. Since $\ell'$ is a maximal $e_i$-rooted
curve, the curve $\ell$ must be a maximal $e$-rooted curve. Further, $\ell$ identifies one road
section more than $\ell'$. Hence, we obtain

$$\omega(\mathcal{L}(\ell)) = \omega(\mathcal{L}(\ell')) + 1 = \omega(c) + 1 = \omega_i(c)$$

Now consider the case that $e$ is a junction edge. Condition (1) and Condition (2)
follow by the same arguments as stated above with the simplification that $L_i = L$.

Condition (3). Let $c$ be a curve in $L_i$ and let $\ell$ be any $e$-rooted curve that ends on $c$.
Further, let $\ell'$ be the $e_i$-rooted sub-curve of $\ell$ that starts at $E(v)$ and ends at $h(\ell)$; by
definition of $L$ such a curve exists. It holds

$$\omega(\mathcal{L}(\ell)) = \omega(\mathcal{L}(\ell')) + \sum_{w \in N(v) \setminus \{w_i\}} \omega(\mathcal{L}(w)) = \omega(c) + \sum_{w \in N(v) \setminus \{w_i\}} \omega(\mathcal{L}(w)) = \omega_i(c)$$

Since $\ell'$ is a maximal $e_i$-rooted curve, it directly follows that $\ell$ is a maximal $e$-rooted
curve with respect to $T_i$. □
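
The first step, as defined before Lemma 4, can be sketched as follows. The
representation is an assumption made for illustration only: curves are (lo, hi, weight)
triples over distance intervals, the half-open pieces of $e$ are simplified to closed
intervals, and the weights of the new pieces on $e$ are passed in by the caller because
computing them requires the labelings $\mathcal{L}(v_{j+1})$. The function names are hypothetical.

```python
def extend_junction(lin, sibling_weights):
    """e is a junction edge: keep all curves and add the summed weights of the
    other children of v (the sum over N(v) without w_i) to every curve."""
    extra = sum(sibling_weights)
    return [(lo, hi, w + extra) for lo, hi, w in lin]


def extend_road_section(lin, marks, piece_weights):
    """e is a road section subdivided at distances marks[0] < ... < marks[-1].

    Each piece between two consecutive marks becomes a new curve with the
    caller-supplied weight piece_weights[j]; every inherited curve gains one
    identified road section, namely e itself.
    """
    pieces = [
        (marks[j], marks[j + 1], piece_weights[j]) for j in range(len(marks) - 1)
    ]
    return pieces + [(lo, hi, w + 1) for lo, hi, w in lin]
```

Both functions return a new list and leave the input linearization untouched, so the
linearization of $e_i$ can still be discarded separately afterwards.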

In the next step we define an operation $\oplus$ by means of which two linearizations
$(L_i, \omega_i)$ and $(L_j, \omega_j)$ can be combined to one linearization $(L_i, \omega_i) \oplus (L_j, \omega_j)$ of $e$
that is restricted to the subtree $T_{i,j}$ induced by the edges of $T_i$ and $T_j$. Consequently,
$\bigoplus_{i=1}^{k} (L_i, \omega_i)$ is the linearization of $e$ without any restrictions.
Fig. 12. Illustration of merging two linearizations $(L_i, \omega_i)$ and $(L_j, \omega_j)$ into one
linearization $(L, \omega)$. The trees are annotated with distance marks.


We define $(L, \omega) = (L_i, \omega_i) \oplus (L_j, \omega_j)$ as follows; for illustration see also Fig. 12.
Let $c_1, \dots, c_m$ be the curves of $L_i \cup L_j$ such that for any two curves $c_s$, $c_t$ with $s < t$
the left endpoint of $I(c_s)$ lies to the left of the left endpoint of $I(c_t)$; ties are broken
arbitrarily. We successively add the curves to $L$ in the given order enforcing that the
curves in $L$ remain superposition-free. Let $c$ be the next curve to be added to $L$.

Without loss of generality, let $c \in L_i$. The opposite case can be handled analogously.
In case that there is no curve superposing $c$, we add $c$ to $L$ and set $\omega(c) = \omega_i(c)$. If $c$
superposes a curve in $L$, due to the order of insertion, there can only be one curve $c'$ in $L$
that superposes $c$. First we remove $c'$ from $L$. Let $I_M$ be the interval describing the set
$I(c) \cap I(c')$, and let $I_L$ and $I_R$ be the intervals describing the set
$(I(c) \cup I(c')) \setminus (I(c) \cap I(c'))$ such that $I_L$ lies to the left of $I_M$ and $I_R$ lies to the
right of $I_M$; see Fig. 11.

We now define three curves $c_L$, $c_M$ and $c_R$ with $I(c_L) = I_L$, $I(c_M) = I_M$ and
$I(c_R) = I_R$ such that each of these three curves is a sub-curve of either $c$ or $c'$. To that
end let $c[I]$ denote the sub-curve of $c$ whose distance interval is $I$. We define the curve
$c_R$ with weight $\omega(c_R)$ as

$$(c_R, \omega(c_R)) = \begin{cases} (c[I_R], \omega_i(c)), & \text{if } I_R \subseteq I(c) \\ (c'[I_R], \omega(c')), & \text{if } I_R \subseteq I(c') \end{cases}$$

The curve $c_L$ and its weight $\omega(c_L)$ are defined analogously. The curve $c_M$ and its weight
$\omega(c_M)$ are

$$(c_M, \omega(c_M)) = \begin{cases} (c[I_M], \omega_i(c)), & \text{if } \omega_i(c) \ge \omega(c') \\ (c'[I_M], \omega(c')), & \text{if } \omega_i(c) < \omega(c') \end{cases}$$

The next lemma proves that $(L_i, \omega_i) \oplus (L_j, \omega_j)$ is a restricted linearization.


Lemma 5. Let $(L_i, \omega_i)$ and $(L_j, \omega_j)$ be two linearizations of $e = (u, v)$ that are
restricted to the trees $T_i$ and $T_j$, respectively. Then $(L, \omega) = (L_i, \omega_i) \oplus (L_j, \omega_j)$ is a
linearization of $e$ restricted to $T_{i,j}$. The operation needs $O(|L_i| + |L_j|)$ time.
Proof. First of all, the set $L$ contains only curves that are pairwise free from any
superpositions. This directly follows from the construction that curves $c$ and $c'$ superposing
each other are replaced by three superposition-free curves $c_L$, $c_M$ and $c_R$. Due to
$I(c_L) \cup I(c_M) \cup I(c_R) = I(c) \cup I(c')$ the first and second condition of a linearization
are satisfied.

We finally prove that Condition (3) of a linearization is satisfied by doing an induction
over the curves inserted to $L$. Let $L^{(k)}$ be $L$ after the $k$-th insertion step. Since $L^{(0)}$ is
empty, the condition obviously holds for $L^{(0)}$. So assume that we insert $c$ to $L^{(k)}$ obtaining
the set $L^{(k+1)}$. Without loss of generality assume that $c \in L_i$. If $c$ does not superpose any
curve in $L^{(k)}$, the condition directly follows from the definition of $c$. So assume that
$c' \in L^{(k)}$ superposes $c$. Since $c \in L_i$, the curve $c'$ is contained in $E(T_j)$. We remove $c'$
from $L^{(k)}$ and insert the curves $c_R$, $c_M$ and $c_L$ as defined above. We prove that all three
curves satisfy Condition (3).

Consider in the following the subtree $T_{i,j}$ of $T_u$ restricted to the edges of $T_i$ and $T_j$.
We set $c_R = c[I_R]$ and set $\omega(c_R) = \omega_i(c)$, if $I_R \subseteq I(c)$. In that case there is no
$e$-rooted curve $\ell \subseteq E(T_j)$ with $\mathrm{len}(\ell) + d_u \in I_R$, i.e., either there is no curve $\ell$ in
$E(T_j)$ with $t(\ell) = E(u)$ and $\mathrm{len}(\ell) + d_u \in I_R$, or any curve in $E(T_j)$ with $t(\ell) = E(u)$
and $\mathrm{len}(\ell) + d_u \in I_R$ ends on a junction edge. Consequently, any $e$-rooted curve $\ell$
with $\mathrm{len}(\ell) + d_u \in I_R$ and in particular any maximal $e$-rooted curve $\ell$ with
$\mathrm{len}(\ell) + d_u \in I_R$ lies in $E(T_i)$. Thus, the curve $c_R$ satisfies Condition (3). For the
case $I_R \subseteq I(c')$ and the curve $c_L$ we can argue analogously.

So consider the curve $c_M$. Without loss of generality we assume that $\omega_i(c) \ge \omega(c')$.
The opposite case can be handled analogously. For any maximal $e$-rooted curve $\ell$ in $E(T_j)$
with $\mathrm{len}(\ell) + d_u \in I_M$ it must be true that $\omega(\mathcal{L}(\ell)) \le \omega(c_M)$. Further, since $c_M \subseteq c$
and $c$ satisfies Condition (3) with respect to $T_i$, $c_M$ satisfies Condition (3) with respect
to $T_{i,j}$. □
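
The operation $\oplus$ analysed in Lemma 5 can be sketched as follows. The sketch makes
simplifying assumptions that are not from the paper: curves are reduced to
(lo, hi, weight) triples over half-open distance intervals $[lo, hi)$, so that the pieces
produced by a split do not overlap, and the name merge is hypothetical. Because the
curves are inserted by increasing left endpoint, the left piece always stems from the
curve already stored.

```python
def merge(lin_i, lin_j):
    """Combine two linearizations given as lists of (lo, hi, weight) triples.

    Each input must consist of pairwise disjoint intervals. Where two curves
    superpose, the overlap keeps the larger weight and the remaining parts
    keep the weight of the curve they stem from.
    """
    result = []
    for lo, hi, w in sorted(lin_i + lin_j):
        if not result or result[-1][1] <= lo:
            result.append((lo, hi, w))            # no superposition
            continue
        plo, phi, pw = result.pop()               # the single superposing curve c'
        if plo < lo:
            result.append((plo, lo, pw))          # left part, covered by c' only
        m_hi = min(hi, phi)
        result.append((lo, m_hi, max(w, pw)))     # middle part, larger weight wins
        if m_hi < hi:
            result.append((m_hi, hi, w))          # right part taken from c
        elif m_hi < phi:
            result.append((m_hi, phi, pw))        # right part taken from c'
    return result
```

The call to sorted costs $O(m \log m)$ for $m$ curves; with inputs that are already sorted
by left endpoint, a linear two-way merge of the two lists would give the $O(|L_i| + |L_j|)$
bound stated in Lemma 5.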

Lemma 4 and Lemma 5 yield that $\bigoplus_{i=1}^{k} (L_i, \omega_i)$ is the linearization of $e$ without any
restrictions. Computing it needs $O(\sum_{i=1}^{k} |L_i|)$ time.

Note that when computing optimal candidates (see Application of linearizations)
we are only interested in $e$-rooted curves $\ell$ that have length at most $\lambda(R)$, where $R$ is
the road of $e$. Hence, when constructing $(L_i, \omega_i)$ for an edge $e_i$ in the first step, we
discard any curve $c$ of $L_i$ that does not allow an $e$-rooted curve that both ends on $c$
and has length at most $\lambda(R)$; the curve $c$ is not necessary for our purposes. Hence, we
conceptually restrict $R$ to the edges that are reachable from $u$ by one label length. It
is not hard to see that $T'$ restricted to $E(T_i)$ contains only $O(n)$ vertices, because each
vertex of $V' \setminus V$ is induced by a chain of tightly packed vertical labels, whereas each
label has length $\lambda(R)$. Hence, $T'$ restricted to $E(T_i)$ contains for each such chain at
most one vertex of $V' \setminus V$. Further, the endpoints of the curves in $L_i$ are induced by the
vertices of $T'$. Hence, by discarding the unnecessary curves of $L_i$ the set $L_i$ has size
$O(n)$. Altogether, by Lemma 5 and due to the constant degree of each vertex we can
construct $\bigoplus_{i=1}^{k} (L_i, \omega_i)$ in $O(\sum_{i=1}^{k} |L_i|)$ time.

When constructing $C(u)$ for $u$ as described in Algorithm 1, we first build the
linearization $L_e$ of each of its outgoing edges. By Lemma 3 we can find in $O(n)$ time the
optimal candidate of $u$. Then, due to the previous reasoning, the linearization of an edge
of $T$ and the optimal candidate of a vertex $u$ can be constructed in $O(n)$ time. Altogether
we obtain the following result.

Proposition 1. For a road map $\mathcal{M}$ with a tree $T$ as underlying road graph,
MAXIDENTIFIEDROADS can be solved in $O(n^3)$ time.


4.3 Improvements on Storage Consumption 


Since $T'$ contains $O(n^2)$ vertices, the algorithm needs $O(n^2)$ space. This can be
improved to $O(n)$ space. To that end $T'$ is constructed on the fly while executing
Algorithm 1. Parts of $T'$ that become unnecessary are discarded. We prove that it is sufficient
to store $O(n)$ vertices of $T'$ at any time such that the optimal labeling can still be
constructed.

When constructing the optimal labeling of T, we build for each edge {u, v) of T its 
linearization based on the linearization of the outgoing edges of v. 

Afterwards we discard the linearizations of those outgoing 
edges. Since each vertex has constant degree, considering 
the vertices of T' in an appropriate order, it is sufficient to 
maintain a constant number of linearizations at any time. 

Hence, because each linearization has size $O(n)$, we
need $O(n)$ space for storing the required linearizations
in total. However, we store for each vertex $u$ of $T'$ the
weight $\omega(\mathcal{L}(u))$ and its optimal candidate. As $T'$ has size
$O(n^2)$ the space consumption is $O(n^2)$. In the following
we improve that bound to $O(n)$ space.

We call a vertex $v \in V'$ reachable from a vertex $u \in V'$,
if there is a curve $\ell \subseteq E(T_u)$ that starts at $E(u)$ and that is
contained in the embedding of a road $R$ with $\lambda(R) \ge \mathrm{len}(\ell)$
such that $E(v) \in \ell$ or $v \in N(\ell)$, where $\mathrm{len}(\ell)$ denotes the length of $\ell$; see Fig. 13. The
set $R_u$ contains all vertices of $T'$ that are reachable from $u$. The next lemma shows
that $R_u$ has linear size.

Fig. 13. Vertices not reachable from $u$ are marked gray.
Lemma 6. For any vertex $u$ of $T'$ the set $R_u$ has size $O(n)$.


Proof. Recall how $T'$ is constructed: For each vertex $v \in V$ we construct a chain $C$
of tightly packed vertical valid labels, which starts at $E(v)$, is directed towards $\rho$, and
ends when either the road ends, or adding the next label does not increase the number of
identified road sections. Each label of such a chain $C$ induces one vertex of $T'$. Hence, $C$
induces a set $V_C$ of vertices in $T'$. We show that for each chain $C$ the set $V_C \cap R_u$
contains at most two vertices. As we construct $n$ chains in order to build $T'$ the claim
follows.

For the sake of contradiction assume that there is a chain $C$ and a vertex $u$ in $T'$ such
that $V_C \cap R_u$ contains more than two vertices. Without loss of generality we assume
that $V_C \cap R_u$ contains three vertices, which we denote by $v_1$, $v_2$ and $v_3$. We further
assume that $d_{v_1} < d_{v_2} < d_{v_3}$. By construction all labels in $C$ lie in the embedding
of the same road $R_C$, and $d(v_1, v_2) \ge \lambda(R_C)$ and $d(v_2, v_3) \ge \lambda(R_C)$. By definition
of $C$ there is a vertical curve $\ell \subseteq E(T_u)$ that starts at $E(u)$ and contains $v_1$, $v_2$ and $v_3$.
Let $e$ be the outgoing edge of $u$ in $T'$ whose embedding is covered by $\ell$ and consider
the sub-curve $\ell' \subseteq \ell$ with length $\lambda(R_C)$ that starts at $u$. By definition of $R_u$, we know
for each $v_i$ with $1 \le i \le 3$ that either its embedding is contained in $\ell'$ or $v_i \in N(\ell')$.
From the definition of $N(\ell')$ and the fact that all three vertices lie on $\ell$, it directly
follows that only $v_3$ may be contained in $N(\ell')$. Hence, $E(v_1), E(v_2) \in \ell'$. Further,
because $v_2 \notin N(\ell')$, we have $E(v_2) \ne h(\ell')$, which implies $d(v_1, v_2) < \lambda(R_C)$ and
contradicts $d(v_1, v_2) \ge \lambda(R_C)$. □


Assume that we apply Algorithm 1 considering the vertex $u$. When constructing
its optimal candidate, by Lemma|^it is sufficient to consider the vertices of $T'$ that lie
in $R_u$. On that account we discard all vertices of $T'$ that lie in $V' \setminus V$, but not in $R_u$.
Further, we compute the vertices of $V' \setminus V$ that subdivide the
incoming edge $(t, u) \in E$ on demand, i.e., we compute them
when constructing the optimal candidate of $t$. Hence, we have
linear space consumption.

However, when discarding vertices of $T'$, we lose the possibility
of reconstructing the labeling. We therefore annotate each vertex
$u \in V$ of the original tree $T$ with further information. To that
end consider a canonical labeling $\mathcal{L}$ of $T$. Let $\ell$ be a horizontal
label of $\mathcal{L}$ and let $e$ be the edge of $T$ on which $\ell$'s head is located.
Either no other label of $\mathcal{L}$ ends on $e$, or another label $\ell'$ ends
on $e$ that belongs to a chain $\sigma_\ell$ of tightly packed vertical labels.
Analogously, we can define the chain $\tau_\ell$ with respect to the edge $e'$
on which $\ell$'s tail is located. On that account we store for a junction vertex $u \in V$ not
only its optimal candidate $\ell \in C(u)$, but also the two chains $\sigma_\ell$ and $\tau_\ell$, if they exist.
Note that such a chain of tightly packed vertical labels is uniquely defined by its start
and endpoint, which implies that $O(1)$ space is sufficient to store both chains. Using a
breadth-first search we can easily reconstruct those chains in linear time. For a regular
vertex $u \in V$ we analogously store $\sigma_\ell$ of its optimal candidate $\ell \in C(u)$, if it exists.
Since $\ell$ is vertical, we do not need to consider its tail. For the special case that $\ell = \bot_u$,
we define that $\sigma_\ell$ is the chain of tightly packed vertical labels that ends on the only
outgoing edge $e$ of $u$. Summarizing, the additional information together with the optimal
candidates stored at the vertices $u \in V$ of the original tree is sufficient to reconstruct
the labeling of $T$. Together with Proposition 1 we obtain the following result.

Fig. 14. Chains of label $\ell$.

Theorem 3. For a road map $\mathcal{M}$ with a tree $T$ as underlying road graph,
MAXIDENTIFIEDROADS can be solved in $O(n^3)$ time using $O(n)$ space.

5 On the Usefulness of Labeling Tree-Shaped Road Networks 

Although the underlying road graphs of real-world road maps are rarely trees, our
algorithm for labeling trees is still of practical interest, as our initial experiments show.
The obtained data is meant to give the reader evidence of the practicability and relevance
of our algorithm, but it is not yet a complete experimental study. For a companion paper
we are working on a detailed evaluation of our approach and are investigating several
practical heuristics that are based on the tree labeling algorithm.

To evaluate the usefulness of our algorithm we considered the road networks of several
large cities. We extracted the road graphs from the data provided by OpenStreetMap¹
and drew them mimicking the standard style used on openstreetmap.org. In
particular, we adopted zoom level 17, which maps 50 m to 65 pixels.

On each road graph we first applied a simple preprocessing strategy removing and 
cutting road sections that can be labeled trivially without violating any optimal solution. 
In particular we applied the following rules. 

1. Remove any road that contains exactly one road section. 

2. Remove any road section that is sufficiently long to completely contain a label 
and whose adjacent road sections are also sufficiently long to completely contain a 
label. Here two road sections are called adjacent, if they are connected by a path 
containing only junction edges. 

3. Cut any road section into two halves that is sufficiently long to contain a label twice 
in a row. 
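
The three rules can be sketched as a single classification pass. The data model is an
assumption made for illustration and not taken from the paper: sections maps a section
id to (road, length), label_len maps a road to its label length, adjacent maps a section
id to the ids of its adjacent sections in the sense of rule 2, and the rules are tried in
the listed order. The function name preprocess is hypothetical.

```python
from collections import Counter


def preprocess(sections, label_len, adjacent):
    """Classify every road section as "remove", "cut" or "keep"."""
    per_road = Counter(road for road, _ in sections.values())

    def fits(section, times=1):
        # True if the section can contain `times` labels of its road in a row.
        road, length = sections[section]
        return length >= times * label_len[road]

    action = {}
    for s, (road, _) in sections.items():
        if per_road[road] == 1:
            action[s] = "remove"                  # rule 1: single-section road
        elif fits(s) and all(fits(t) for t in adjacent.get(s, ())):
            action[s] = "remove"                  # rule 2: s and its neighbours fit a label
        elif fits(s, times=2):
            action[s] = "cut"                     # rule 3: room for two labels in a row
        else:
            action[s] = "keep"
    return action
```

Sections marked "keep", together with the halves produced by "cut", form the
subgraphs that remain to be labeled.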

That preprocessing strategy decomposed the road graphs into a large number of
subgraphs; see Table 1. For example, for the road network of London, which consists
of 143856 road sections, the rules of the preprocessing strategy matched 91405 road
sections, so that the road graph decomposed into 21825 subgraphs. Note that if we are
able to label those subgraphs optimally, we obtain an optimal labeling for the whole
road network by the choice of the preprocessing rules. Table 1 further shows that most
of those subgraphs are trees (85.1% for Berlin as a minimum and 97.7% for Los Angeles
as a maximum). Hence, using our tree labeling algorithm we can label a large number of
the remaining subgraphs optimally. We conjecture that using the preprocessing strategy
in combination with the tree labeling algorithm and some heuristics or exact methods for
the non-tree subgraphs we can label real-world instances near-optimally. This hypothesis
is also supported by the observation that most of the road sections are either matched
by the preprocessing strategy or are contained in trees (55.7% + 32.9% = 88.6% for
Paris as a minimum and 67.5% + 28.6% = 96.1% for Los Angeles as a maximum). For
our planned companion paper we are currently working on corresponding experiments
investigating that conjecture. Further, we are developing heuristics and exact algorithms
for labeling the remaining non-tree subgraphs.

¹ openstreetmap.org

Table 1. Number of connected subgraphs and road sections for road networks of five cities.
The column subgraphs contains the number of connected subgraphs into which the graph is
decomposed after preprocessing: 1. the total number of subgraphs, 2. the number of trees, 3. the
number of subgraphs with one cycle, and 4. the number of subgraphs with more than one cycle.
The column road sections contains the number of road sections 1. in total, 2. matched by the
preprocessing strategy, 3. contained in trees, 4. contained in subgraphs with one cycle and 5.
contained in subgraphs with more than one cycle.

                Subgraphs (after preprocessing)         Road sections
                total   trees   1 cycle  ≥ 2 cycles     total    matched  trees    1 cycle  ≥ 2 cycles
Berlin          5702    4853    549      300            49773    36021    8220     2170     3362
                100%    85.1%   9.6%     5.3%           100%     72.4%    16.5%    4.4%     6.8%
Paris           22929   20604   1742     583            145971   81305    48009    8329     8328
                100%    89.9%   7.6%     2.5%           100%     55.7%    32.9%    5.7%     5.7%
London          21825   20538   1012     275            143856   91405    44845    4485     3121
                100%    94.1%   4.6%     1.3%           100%     63.5%    31.2%    3.1%     2.2%
Los Angeles     48248   47131   767      350            397505   268334   113842   5149     10180
                100%    97.7%   1.6%     0.7%           100%     67.5%    28.6%    1.3%     2.6%
New York City   10318   9817    306      195            108417   72057    25549    3011     7800
                100%    95.1%   3%       1.9%           100%     66.5%    23.6%    2.8%     7.2%

For example, we can improve our results by adapting our tree labeling algorithm to
subgraphs containing exactly one cycle $C$. We observe that there are three cases for
such a subgraph: (1) no label identifies any road section of $C$, (2) there is a label $\ell$ that
identifies only road sections of $C$, or (3) there is a label $\ell$ that identifies both road sections
of $C$ and road sections of the remaining component. In the first case we can remove $C$
completely from the subgraph, such that it decomposes into a set of trees. In the second
and third case the label $\ell$ splits the cycle $C$ so that the remaining road sections form
trees. We explore all choices of $\ell$, taking the best choice. Hence, we can label subgraphs
containing exactly one cycle optimally, which further increases the share of road sections
that we can label optimally (92.8% for New York City as a minimum and 97.8% for London
as a maximum).
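
The percentages quoted in this section can be recomputed from the road-section counts
in Table 1. The following sketch does so; the dictionary layout and the name
optimal_share are choices made for this illustration only.

```python
# Road-section counts from Table 1:
# (matched by preprocessing, in trees, in 1-cycle subgraphs, in subgraphs with >= 2 cycles)
ROAD_SECTIONS = {
    "Berlin": (36021, 8220, 2170, 3362),
    "Paris": (81305, 48009, 8329, 8328),
    "London": (91405, 44845, 4485, 3121),
    "Los Angeles": (268334, 113842, 5149, 10180),
    "New York City": (72057, 25549, 3011, 7800),
}


def optimal_share(city, with_one_cycle=False):
    """Percentage of road sections that are matched or lie in trees,
    optionally also counting sections in subgraphs with exactly one cycle."""
    matched, trees, one_cycle, many_cycles = ROAD_SECTIONS[city]
    total = matched + trees + one_cycle + many_cycles
    covered = matched + trees + (one_cycle if with_one_cycle else 0)
    return round(100 * covered / total, 1)


for city in ROAD_SECTIONS:
    print(city, optimal_share(city), optimal_share(city, with_one_cycle=True))
```

Without the one-cycle subgraphs the share ranges from 88.6% (Paris) to 96.1%
(Los Angeles); with them it ranges from 92.8% (New York City) to 97.8% (London).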


6 Conclusions and Outlook 

In this paper we investigated the problem of maximizing the number of identified road
sections in a labeling of a road map; we showed that it is NP-hard in general, but can be
solved in $O(n^3)$ time and linear space for the special case of trees.

The underlying road graphs of real-world road maps are rarely trees. Initial
experimental evidence indicates, however, that road maps can be decomposed into a large
number of subgraphs by placing trivially optimal road labels and removing the
corresponding edges from the graph. It turns out that between 85.1% and 97.7% of the
resulting subgraphs are actually trees, which we can label optimally by our proposed
algorithm. As a consequence, a large fraction (between 88.6% and 96.1%)
of all road sections in our real-world road graphs can be labeled optimally by combining
this simple preprocessing strategy with the tree labeling algorithm. We are investigating
further heuristic and exact approaches for labeling the remaining non-tree subgraphs
(e.g., by finding suitable spanning trees and forests) for a separate companion paper.
A Computational Complexity

A.1 Description of an Alternative Clause Gadget


In this section we describe a clause gadget that can be used as an alternative to the
one presented in Section]^ Since it consists only of roads that are paths, this gadget
strengthens Theorem 1.

Theorem 4. For a given road map $\mathcal{M}$ and an integer $K$ it is NP-hard to decide if in
total at least $K$ road sections can be identified, even if all roads are paths.


The clause gadget consists of ten roads, $r$, $p_a$, $p_b$, $p_c$, $a_i$, $b_i$ and $c_i$ with $i \in \{1, 2\}$,
that all are paths; see Fig. 15. Going along $r$ from one end to the other, the junctions with
the roads $a_i$, $b_i$ and $c_i$ ($1 \le i \le 2$) occur in three densely packed blocks. The blocks are
described by the sequence of roads intersecting $r$. The first block is $B_a = (a_1, c_2, b_1, a_2)$,
the second block is $B_b = (a_2, c_1, b_2)$ and the third block is $B_c = (b_2, c_1, c_2, a_1)$. The
label length of $r$ is chosen so that at most three labels can be placed on $r$, but each road
section is shorter than a label of $r$. Choosing the length of the road sections appropriately,
we further ensure that we can place a label that crosses all junctions of one of the blocks
without crossing the junctions of another block.

We now describe the junctions of the roads $p_a$, $p_b$, $p_c$, $a_i$, $b_i$ and $c_i$ with $i \in \{1, 2\}$. The
road $a_1$ first intersects $p_a$ and then $r$ twice. Let […] denote these road
sections in that particular order. The length of […] is chosen so that a single label can be
placed on […], while the others are shorter than the label length of $a_1$. More specifically,
we define $a_1$'s label length such that a label identifies the sections in either
{[…]} or {[…]}. We define the intersections
and the label length for $a_2$ analogously. Further, $p_a$ intersects $a_1$ and $a_2$ in one junction,
i.e., the edge of $p_a$ connecting both junction vertices is a junction edge. The label length
of $p_a$ is chosen so that a label can cross $p_a$'s only junction. The length of $p_a$'s road
sections is at least as long as $p_a$'s label length. We call $p_a$ a gate, because later this road
will be connected to the end road of a chain by a junction; see the violet square in Fig. 15(a).
For $b_1$, $b_2$, $c_1$, $c_2$ we introduce analogous junctions and road sections; however, $b_1$ and
$b_2$ intersect $p_b$ instead of $p_a$, and $c_1$ and $c_2$ intersect $p_c$ instead of $p_a$.

In order to identify both road sections of a gate, either two labels can be placed on
the road sections separately, or one label that goes through the junction. In the former
case the gate is open and in the latter case it is closed; see Fig. 15(b). We observe that it
only makes sense to close a gate if at least one road section of the gate does not allow
placing a label that is only contained in that road section. This case will occur if and only
if the connected chain transmits the value false to the clause.

Assume that at least one gate is open, i.e., one literal of the clause is true; see
Fig. 15(b). Without loss of generality let $p_a$ be open. We place a label $\ell_r$ on $r$ such
that it crosses the junctions of block $B_a$ and identifies 5 sections. Since $p_a$ is open, we
can place a label that identifies […] and […]. Analogously, we can place a label
identifying […] and […]. Placing further labels as indicated in Fig. 15(b) we identify five
road sections of $r$ and all road sections of any other road except for […]. Hence, 33
road sections are identified.
Fig. 15. Illustration of the alternative clause gadget, which only uses paths as roads. (a) Structure
of the clause gadget. (b) Optimal labeling for the case that at least one literal is true. (c) Optimal
labeling for the case that all literals are false.


We observe that we can place the labels of $b_1$, $b_2$, $c_1$, $c_2$ such that they do not cross
the junctions of $p_b$ and $p_c$, respectively. Hence, it does not matter whether $p_b$ and $p_c$ are
closed or open, i.e., it does not matter whether the corresponding literals are true or false.

We now argue that this is an optimal labeling. If […] or […] were labeled, the label $\ell_r$
must be placed such that the junctions of $r$ with $c_2$ and $b_1$ are not crossed, respectively.
This decreases the number of identified road sections at least as much as identifying […]
and […] increases the number of identified road sections. In order to identify at least one
of the unidentified road sections of $r$, we need to place a label that crosses $B_b$ or $B_c$.
Obviously, this yields a smaller number of identified road sections than 31.

Finally, assume that all gates are closed; see Fig. 15(c). Consider the same labeling
as before. However, this time we cannot label […] and […] anymore. Hence, this labeling
has only 29 identified road sections. Obviously, it cannot be improved by changing the
placement of the remaining labels or adding labels.















































